contributor | Anwendersoftware (IPVR) |
creator | Rantzau, Ralf |
date | 2003-06 |
description |
SQL-based data mining algorithms are rarely used in practice today.
Most performance experiments have shown that SQL-based approaches
are inferior to main-memory algorithms. Nevertheless, database
vendors try to integrate analysis functionalities to some extent
into their query execution and optimization components in order to
narrow the gap between data and processing. Such database support
is particularly important when data mining applications need to
analyze very large datasets or when they need to access current data,
not a possibly outdated copy of it.
We investigate approaches based on SQL for the problem of finding
frequent itemsets in a transaction table, including an algorithm
that we recently proposed, called Quiver, which employs universal
and existential quantifications. This approach employs a table
schema for itemsets that is similar to the commonly used vertical
layout for transactions: each item of an itemset is stored in a
separate row. We argue that expressing the frequent itemset
discovery problem using quantifications offers interesting
opportunities to process such queries using set containment join or
set containment division operators, which are not yet available in
commercial database systems. Initial performance experiments reveal
that Quiver cannot be processed efficiently by commercial DBMSs.
However, our experiments with query execution plans that use
operators realizing set containment tests suggest that efficient
processing of Quiver is possible.
format | application/pdf |
202152 Bytes | |
identifier | http://www.informatik.uni-stuttgart.de/cgi-bin/NCSTRL/NCSTRL_view.pl?id=INPROC-2003-02&engl=1 |
language | eng |
publisher | Rensselaer Polytechnic Institute, Troy, New York 12180-3590, USA |
relation | Report No. 03-05 |
source | In: Zaki, Mohammed (ed.); Aggarwal, Charu (ed.): Proceedings of the ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD), San Diego, California, USA, June 13, 2003, pp. 20-27 |
ftp://ftp.informatik.uni-stuttgart.de/pub/library/ncstrl.ustuttgart_fi/INPROC-2003-02/INPROC-2003-02.pdf | |
subject | Database Management Systems (CR H.2.4) |
Database Applications (CR H.2.8) | |
association rule discovery | |
relational division | |
set containment join | |
title | Processing Frequent Itemset Discovery Queries by Division and Set Containment Join Operators |
type | Text |
Article in Proceedings |